Rows: 1988 Columns: 28
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (13): category, medicine_name, therapeutic_area, common_name, active_su...
dbl (1): revision_number
lgl (8): patient_safety, additional_monitoring, generic, biosimilar, condi...
dttm (2): first_published, revision_date
date (4): marketing_authorisation_date, date_of_refusal_of_marketing_author...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
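The two hints at the end of the readr message can be sketched as follows. This is a minimal illustration, assuming a small temporary CSV with made-up values in place of the real file; the path and the subset of columns are hypothetical.

```r
library(readr)

# Toy stand-in for the real file (assumption): a temporary CSV holding a few
# of the columns named in the column specification, with made-up values.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "medicine_name,revision_number,generic,marketing_authorisation_date",
  "Drug A,3,FALSE,2010-01-15",
  "Drug B,12,TRUE,2018-06-30"
), path)

# Option 1: keep readr's type guessing but silence the message.
drugs_quiet <- read_csv(path, show_col_types = FALSE)

# Option 2: state the column types explicitly, which also silences the message.
drugs_typed <- read_csv(
  path,
  col_types = cols(
    medicine_name = col_character(),
    revision_number = col_double(),
    generic = col_logical(),
    marketing_authorisation_date = col_date()
  )
)
```

Spelling out the types is more verbose, but it makes the import fail loudly if a column changes type in a later version of the file.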
skimr::skim(drugs)
Data summary
Name                     drugs
Number of rows           1988
Number of columns        28
_______________________
Column type frequency:
  character              13
  Date                   4
  logical                8
  numeric                1
  POSIXct                2
________________________
Group variables          None
Variable type: character
skim_variable                                n_missing  complete_rate  min  max   empty  n_unique  whitespace
category                                     0          1.00           5    10    0      2         0
medicine_name                                0          1.00           3    125   0      1976      0
therapeutic_area                             285        0.86           4    400   0      669       0
common_name                                  4          1.00           4    220   0      1261      0
active_substance                             1          1.00           4    823   0      1345      0
product_number                               0          1.00           6    6     0      1932      0
authorisation_status                         1          1.00           7    10    0      3         0
atc_code                                     28         0.99           3    18    0      1074      0
marketing_authorisation_holder_company_name  4          1.00           4    65    0      615       0
pharmacotherapeutic_group                    34         0.98           7    174   0      365       0
condition_indication                         12         0.99           18   7597  0      1886      0
species                                      1709       0.14           4    67    0      59        0
url                                          0          1.00           53   148   0      1988      0
Variable type: Date
skim_variable                               n_missing  complete_rate  min         max         median      n_unique
marketing_authorisation_date                60         0.97           1995-10-20  2023-02-20  2013-06-09  1127
date_of_refusal_of_marketing_authorisation  1913       0.04           2004-09-07  2022-04-29  2013-04-25  67
date_of_opinion                             779        0.61           1995-07-12  2022-12-15  2016-07-21  389
decision_date                               45         0.98           1998-08-20  2023-03-10  2022-02-16  815
Variable type: logical
skim_variable              n_missing  complete_rate  mean  count
patient_safety             0          1              0.01  FAL: 1977, TRU: 11
additional_monitoring      0          1              0.19  FAL: 1601, TRU: 387
generic                    0          1              0.16  FAL: 1673, TRU: 315
biosimilar                 0          1              0.05  FAL: 1896, TRU: 92
conditional_approval       0          1              0.02  FAL: 1940, TRU: 48
exceptional_circumstances  0          1              0.02  FAL: 1940, TRU: 48
accelerated_assessment     0          1              0.02  FAL: 1940, TRU: 48
orphan_medicine            0          1              0.08  FAL: 1826, TRU: 162
Variable type: numeric
skim_variable    n_missing  complete_rate  mean   sd     p0  p25   p50  p75  p100  hist
revision_number  96         0.95           13.53  11.65  0   4.75  11   19   89    ▇▃▁▁▁
Variable type: POSIXct
skim_variable    n_missing  complete_rate  min                  max                  median               n_unique
first_published  0          1.00           1998-08-20 00:00:00  2023-03-09 18:50:00  2018-04-14 23:14:30  1760
revision_date    29         0.99           2000-07-17 02:00:00  2023-03-13 11:52:00  2022-05-11 11:58:00  1932
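As a quick sanity check on the skim summary, `complete_rate` is the share of non-missing values, that is, one minus `n_missing` divided by the number of rows. The sketch below recomputes it in base R for a few columns, using the `n_missing` counts and the row count reported above; the small `toy` data frame at the end is made up purely to show the same calculation on actual data.

```r
# Row count and n_missing values copied from the skim summary above.
n_rows <- 1988
n_missing <- c(
  therapeutic_area = 285,
  species          = 1709,
  date_of_opinion  = 779,
  revision_number  = 96
)

# complete_rate = 1 - n_missing / n_rows, rounded as in the printed summary.
complete_rate <- round(1 - n_missing / n_rows, 2)
print(complete_rate)

# The same quantities computed directly from a data frame
# (toy data, made up for illustration).
toy <- data.frame(a = c(1, NA, 3, 4), b = c(NA, NA, "x", "y"))
toy_missing <- colSums(is.na(toy))
toy_complete <- 1 - toy_missing / nrow(toy)
```

The recomputed rates match the printed `complete_rate` column (0.86, 0.14, 0.61 and 0.95 for the four columns above).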
Next, we'll take a look at the column types and missing values using visdat::vis_dat():
visdat::vis_dat(drugs)
Warning: `gather_()` was deprecated in tidyr 1.2.0.
ℹ Please use `gather()` instead.
ℹ The deprecated feature was likely used in the visdat package.
Please report the issue at <https://github.com/ropensci/visdat/issues>.
Drugs by Therapeutic Area
Not surprisingly, type 2 diabetes is the most common therapeutic area, followed by HIV infections and hypertension.
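A ranking like this can be produced by counting how often each therapeutic area appears. The skim summary reports `therapeutic_area` values up to 400 characters long, which suggests that a single medicine can list several areas in one string. The sketch below is one way to count them in base R, assuming the areas are separated by semicolons; the separator, the toy data frame and its labels are assumptions for illustration, not values taken from the real data.

```r
# Toy data (made up): one row per medicine, possibly with several
# semicolon-separated therapeutic areas, and one missing value.
toy <- data.frame(
  medicine_name = c("Drug A", "Drug B", "Drug C", "Drug D"),
  therapeutic_area = c(
    "Diabetes Mellitus, Type 2",
    "HIV Infections",
    "Diabetes Mellitus, Type 2;  Hypertension",
    NA
  ),
  stringsAsFactors = FALSE
)

# Split each string on ";" plus any following whitespace (assumed separator),
# flatten the result into one vector, and drop missing entries.
areas <- unlist(strsplit(toy$therapeutic_area, ";\\s*"))
areas <- areas[!is.na(areas)]

# Count each area and sort from most to least frequent.
area_counts <- sort(table(areas), decreasing = TRUE)
print(area_counts)
```

Splitting before counting matters: without it, a medicine listing two areas would be counted under a combined label rather than under each area separately.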